Action Recognition using Visual Attention
Abstract
We propose a soft attention based model for the task of action recognition in videos. We use multi-layered Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units which are deep both spatially and temporally. Our model learns to focus selectively on parts of the video frames and classifies videos after taking a few glimpses. The model essentially learns which parts in the frames are relevant for the task at hand and attaches higher importance to them. We evaluate the model on UCF-11 (YouTube Action), HMDB-51 and Hollywood2 datasets and analyze how the model focuses its attention depending on the scene and the action being performed.
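As a rough illustration of what "soft attention" over frame regions can look like, the sketch below computes a softmax distribution over the spatial locations of one frame's feature grid and uses it to form a weighted-average feature vector (one "glimpse"). It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `soft_attention_glimpse`, the scoring matrix `W`, the grid size `K`, and the dimensions `D` and `H` are all hypothetical, and random arrays stand in for convolutional features and the recurrent state.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def soft_attention_glimpse(features, hidden, W):
    """Compute one soft-attention glimpse over a frame.

    features: (K*K, D) array, one D-dim feature vector per spatial location
              (assumed to come from some convolutional network).
    hidden:   (H,) array, the previous recurrent state (e.g. of an LSTM).
    W:        (K*K, H) array, hypothetical weights that score each location.
    Returns the attention weights (K*K,) and the weighted feature (D,).
    """
    scores = W @ hidden            # one scalar score per location
    weights = softmax(scores)      # non-negative, sums to 1
    glimpse = weights @ features   # expected feature under the attention
    return weights, glimpse


# Toy usage with random stand-ins for real features and state.
rng = np.random.default_rng(0)
K, D, H = 7, 16, 8
features = rng.normal(size=(K * K, D))
hidden = rng.normal(size=H)
W = rng.normal(size=(K * K, H))

weights, glimpse = soft_attention_glimpse(features, hidden, W)
```

In a recurrent setup, the glimpse would typically be fed to the recurrent unit at each time step, and the weights can be reshaped to `(K, K)` to visualize where the model is attending in the frame.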
Similar resources
Transfer from action to perception: The effect of motor-perceptual enrichment
This study investigated the effect of audiovisual integration on action-perception transfer. Forty subjects were randomly divided into four groups: visual, visual-auditory, control visual, and control visual-auditory. The visual groups watched the movement pattern of a skilled basketball player; the other groups, in addition to watching the pattern, heard the elbow angular velocity as a sonification. In first st...
Action Classification and Highlighting in Videos
Inspired by recent advances in neural machine translation that jointly align and translate using encoder-decoder networks equipped with attention, we propose an attention-based LSTM model for human activity recognition. Our model jointly learns to classify actions and highlight frames associated with the action, by attending to salient visual information through a jointly learned soft-attention...
Online learning of task-driven object-based visual attention control
We propose a biologically-motivated computational model for learning task-driven and object-based visual attention control in interactive environments. In this model, top-down attention is learned interactively and is used to search for a desired object in the scene through biasing the bottom-up attention in order to form a need-based and object-driven state representation of the environment. O...
Cortex-inspired Recurrent Networks for Developmental Visual Attention and Recognition
By Matthew Luciw. It is unknown how the brain self-organizes its internal wiring without a holistically-aware central controller. How does the brain develop internal object representations for a massive number of objects? How do such representations enable tightly intertwined attention and recognition in the pre...
Early Posterior Negativity as Facial Emotion Recognition Index in Children With Attention Deficit Hyperactivity Disorder
Introduction: Studies indicate that children with Attention Deficit Hyperactivity Disorder (ADHD) have deficits in social and emotional functions. It can be hypothesized that these children have some deficits in early stages of facial emotion discrimination. Based on this hypothesis, the present study investigated neural correlates of early visual processing during emotional face recognition in...
Journal title:
- CoRR
Volume: abs/1511.04119
Publication date: 2015